Character-Level Interaction in Multimodal Computer-Assisted Transcription of Text Images

Authors

  • Daniel Martín-Albo
  • Verónica Romero
  • Alejandro Héctor Toselli
  • Enrique Vidal
Abstract

To date, automatic handwritten text recognition systems are far from perfect, and heavy human intervention is often required to check and correct their results. As an alternative, previous works have presented an interactive framework that integrates human knowledge into the transcription process. This work studies multimodal interaction at the character level. Until now, multimodal interaction had been studied only at the whole-word level. However, character-level pen-stroke interactions may lead to more ergonomic and friendly interfaces. Empirical tests show that this approach can save significant amounts of user effort with respect to both fully manual transcription and non-interactive post-editing correction.

Similar resources

A Web-Based Demo to Interactive Multimodal Transcription of Historic Text Images

Paleography experts spend many hours transcribing historic documents, and state-of-the-art handwritten text recognition systems are not suitable for performing this task automatically. In this paper we present the modifications on a previously developed interactive framework for transcription of handwritten text. This system, rather than full automation, aimed at assisting the user with the rec...

Challenges in Transcribing Multimodal Data: a Case Study

Computer-mediated communication (CMC) once meant principally text-based communication mediated by computers, but rapid technological advances in recent years have heralded an era of multimodal communication with a growing emphasis on audio and video synchronous interaction. As CMC, in all its variants (text chats, video chats, forums, blogs, SMS, etc.), has become normalized practice in persona...

Order embeddings and character-level convolutions for multimodal alignment

With the novel and fast advances in the area of deep neural networks, several challenging image-based tasks have been recently approached by researchers in pattern recognition and computer vision. In this paper, we address one of these tasks, which is to match image content with natural language descriptions, sometimes referred to as multimodal content retrieval. Such a task is particularly challe...

Preprocessing and Feature Extraction Techniques for Multimodal Interactive Transcription of Text Images

To date, automatic handwriting recognition systems are far from being perfect and heavy human intervention is often required to check and correct the results of such systems. This “post-editing” process is both inefficient and uncomfortable to the user. An example is the transcription of historic documents: State-of-the-art handwritten text recognition technology is not suitable to perform this...

A Multimodal Data Mining Framework for Revealing Common Sources of Spam Images

This paper proposes a multimodal framework that clusters spam images so that ones from the same spam source/cluster are grouped together. By identifying the common sources of spam images, we can provide evidence in tracking spam gangs. For this purpose, text recognition and visual feature extraction are performed. Subsequently, a two-level clustering method is applied where images with visually...

Publication date: 2011